This paper deals with one method of organizing the storage unit of a
descriptor IPS (information retrieval system) of the SPOD type, the
information array of which constitutes the totality of uniform
documents with an ordered disposition of data within each of them.

Classificational data determine the position of the document in
the storage system. They include bibliographic information and the
storage address of the document in the information-retrieval array.

Base data designate the objects, phenomena, or actions which the
document characterizes. It is necessary to note
that these data may not be included in the title of the document.

Key data are the necessary set of words from the text of the
document which reflect the details of its contents with sufficient
accuracy (for retrieval purposes).

Like key data, base data are the reflection of the contents of a
document in its retrieval form. The separation of base data into a
separate category is connected with the practice of request
formulation. It is possible to assume that the category of data called
"base" may appear only during the formulation of a request, since
the consumer, for one reason or another, does not always know the
details of the contents of a document. At the same time the form
of the representation of these categories in the retrieval form of
the document varies.


Let  us  examine,  in  a  concrete  example,  the  above  proposed 
categories  and  the  relationship  of  the  separate  data  within  them. 

Let us assume that the array contains reference data about various
types of semiconductors, while the document contains reference data
about the ?;.A transistor jii].

The classificational data of the retrieval form of this document
include bibliographic information and the document's address in the
information-retrieval system utilized.

For  the  examined  retrieval  form  of  the  document,  the  descriptor 
"?W  belongs  to  the  category  of  base  data. 

The  contents  of  the  document  in  the  examined  example  have  been 
divided  into  these  sections:  "assignment,"  "forming,"  "general 
data,"  "maximum  permissible  electrical  data,"  etc.  Each  section  is 
described by a group of key data. Here, some of these data, besides
the purely semantic data, bear quantitative information. For example,
in the section "general data" these concepts are used:

greatest height - 10 mm
greatest diameter - 31 mm, etc.
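
As a rough illustration of the three categories, the retrieval form of
the example document might be laid out as below. This is only a sketch:
the class and field names, the placeholder designation "X", the storage
address, and the bibliographic string are assumptions made for
illustration; only the section names and the two "general data" values
are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyDatum:
    """One key datum: a semantic name, optionally with a quantitative value."""
    name: str
    value: Optional[float] = None
    unit: Optional[str] = None


@dataclass
class RetrievalForm:
    """Retrieval form of one document, split into the three data categories."""
    classificational: dict   # bibliographic information and storage address
    base: list               # designations of the described object
    key: dict = field(default_factory=dict)  # section name -> list of KeyDatum


form = RetrievalForm(
    classificational={"bibliography": "reference handbook (placeholder)",
                      "address": 17},        # hypothetical storage address
    base=["X"],                              # placeholder designation
    key={
        "general data": [
            KeyDatum("greatest height", 10, "mm"),
            KeyDatum("greatest diameter", 31, "mm"),
        ],
        "maximum permissible electrical data": [],  # values not given in the text
    },
)

# Pull one quantitative value out of the "general data" section.
heights = [d.value for d in form.key["general data"]
           if d.name == "greatest height"]
print(heights)
```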

The character of the information array makes it possible to determine
the expected character of the bulk of the requests connected with
the operation of retrieval: from a known series of defined data it
is necessary to obtain the concrete address of the document and
certain information contained in it in the form of a descriptor
description and numerical values.

It  is  possible  to  assume  the  following  basic  types  of  requests 
to  the  IPS  with  the  above  array. 

1.  From  the  designation  of  the  object,  establish  the  values  of 
its  defined  criteria. 

2.  From  the  defined  criteria  of  the  object,  establish  the 
values  of  other  defined  criteria. 

3. From defined criteria, establish the designation of the
object.

4. Check whether the concrete object, the designation or series
of criteria of which are known, possesses other defined criteria.

values of certain

6. Issue documentary information (in the form of a concrete
address of the document from the document) which possesses defined
criteria [10].
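
Request types 1, 3, and 4 can be pictured with a direct scan over a
small array. The sketch below is an assumption-laden illustration, not
the system described: the flat dictionary layout, the function names,
and all designations and values are invented.

```python
# Each retrieval form is reduced to a designation plus named numeric criteria.
# All designations and values below are invented for illustration.
FORMS = [
    {"designation": "A1", "address": 1, "height_mm": 10, "diameter_mm": 31},
    {"designation": "B2", "address": 2, "height_mm": 8, "diameter_mm": 12},
]


def values_for(designation, wanted):
    """Type 1: from the designation, return the values of the wanted criteria."""
    for form in FORMS:
        if form["designation"] == designation:
            return {name: form[name] for name in wanted}
    return None


def designations_where(**criteria):
    """Type 3: from defined criteria, return the matching designations."""
    return [f["designation"] for f in FORMS
            if all(f.get(k) == v for k, v in criteria.items())]


def possesses(designation, **criteria):
    """Type 4: check whether a known object has the other defined criteria."""
    return designation in designations_where(**criteria)
```

For instance, values_for("A1", ["height_mm"]) reads one value by
designation, while possesses("A1", height_mm=8) checks a known object
against a criterion it does not have.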

The realization of such requests is connected with the
fulfillment of a number of retrieval operations, such as, for
example, sorting of data by a defined series of criteria, establishing
the correspondence of the request to the document under the condition
of entry (nonentry) of the defined series of request criteria in
the retrieval form of the document, and so forth.

Two basic methods of realizing retrieval operations are known:
direct and inverse. The specific character of the information
material, which contains a significant number of data with concrete
quantitative values, and also the specific character of the requests,
which are based on concrete quantitative values of defined data, render
inverse search (in its pure form) practically unacceptable in view of
the difficulty of realization. Therefore, for the examined type of
information arrays, it is apparently more expedient to use the direct
method of realization of retrieval operations. It is necessary to note
the possibility of realizing retrieval with a combination of the
direct and inverse systems. At first the inverse method determines
the addresses of zones with retrieval forms corresponding to the
request; then, by the direct method, the necessary data are selected
from these retrieval forms. The proposed variant is expedient
when two conditions are met:

1) the selection of retrieval forms of documents is made
according to semantic values of the data without using their
quantitative values,

2)  the  storage  unit  of  retrieval  forms  of  documents  has  random 
access. 
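
The combined variant can be sketched as follows, under both conditions
above: selection uses only semantic descriptors, and zones are read by
address. The names ZONES, INVERSE, and combined_search, as well as all
zone contents, are assumptions made for this illustration.

```python
from collections import defaultdict

# Retrieval forms stored by address (random access); contents are invented.
ZONES = {
    1: {"descriptors": {"transistor", "low frequency"}, "values": {"height_mm": 10}},
    2: {"descriptors": {"transistor", "high frequency"}, "values": {"height_mm": 8}},
    3: {"descriptors": {"diode"}, "values": {"height_mm": 10}},
}

# Inverse file: semantic descriptor -> addresses of zones that contain it.
INVERSE = defaultdict(set)
for address, zone in ZONES.items():
    for descriptor in zone["descriptors"]:
        INVERSE[descriptor].add(address)


def combined_search(descriptors, wanted):
    """Inverse step on semantic descriptors, then direct read of the values."""
    candidate_sets = [INVERSE.get(d, set()) for d in descriptors]
    addresses = set.intersection(*candidate_sets) if candidate_sets else set()
    # Direct step: visit only the selected zones and pull the requested data.
    return {a: {name: ZONES[a]["values"][name] for name in wanted}
            for a in sorted(addresses)}


print(combined_search(["transistor", "low frequency"], ["height_mm"]))
```

The quantitative values never enter the inverse file; they are read
only from the zones that the semantic step has already selected.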


which  values  of  the  assigned  criteria  the  indicated 
other  criteria  occur. 


The figure shows the structure of the retrieval form of a
document selected as an example.

The descriptors which make up the retrieval form of the document
are connected by sort-type relationships and are situated on three
hierarchical levels. Requests to the system can be formulated on
any of them. Here on level 1 the contents of the entire retrieval
form should be issued; on level 2, the contents of the corresponding
section; and on level 3, the value of separate documentary data.
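
The three-level output rule can be illustrated with a flat list in which
every descriptor carries a level mark. This is a sketch under that
assumption; the function name output and all descriptor texts are
invented for the example.

```python
# Retrieval form as an ordered list of (level, descriptor); contents are invented.
FORM = [
    (1, "transistor X"),
    (2, "general data"),
    (3, "greatest height 10 mm"),
    (3, "greatest diameter 31 mm"),
    (2, "maximum permissible electrical data"),
    (3, "collector voltage (value not given)"),
]


def output(form, descriptor):
    """Return the requested descriptor and everything below it in its section."""
    result = []
    level = None
    for item_level, item in form:
        if level is None:
            if item == descriptor:
                level = item_level
                result.append(item)
        elif item_level > level:
            result.append(item)
        else:
            break  # an item of the same or a higher level ends the section
    return result
```

A request on level 2, such as output(FORM, "general data"), returns the
section heading with its two level-3 data, while a request on level 1
returns the whole form.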

One of the practical possibilities of realization of sort-type
relations in the system is giving each descriptor of the retrieval
form the criterion of the level. In this case, inside each retrieval

With by-aspect organization, for each aspect or group of aspects
of a document its own retrieval form is composed. In this case,
several retrieval forms may correspond to each document, each of which
reflects only the separate aspects of its contents.

The selection of this or that organization of the storage unit
is determined by the assumed character of the bulk of the
requests, which, for an answer, require either total examination of the
document or an examination of only its individual aspects.

With direct organization of retrieval, the retrieval images of
documents are recorded in the storage unit in the form of retrieval
zones. Here each documentary category of data must be given an
individual criterion. The need for allotting a criterion becomes
clear when examining the structure of the request.

Any request to the retrieval system contains:

1) a certain set of data for setting the correspondence of a
certain set of documents to the request,

2) a certain set of criteria of data which are the answer to
the request.

Thus, for an answer to the request "Give the brand of semiconductor
triode (transistor) which is used for low frequency power ..." it is
necessary to rework the text of the request in such a way that, instead
of the words "brand," "semiconductor," and "triode," a criterion
corresponding to the collection of these words is introduced into the
retrieval device. This reworking of the request text can be done either
manually, with the aid of translation tables, or automatically, with a
special dictionary.


It should be noted that with such an organization of the retrieval
zone, the answer includes all of the descriptors of the retrieval
form noted by the criterion indicated in the request. Thus, if the
bibliographic information and the address of the document are given
in the retrieval zone under one criterion, "classification data," then
to the request "Indicate the concrete address of the document" the
system gives an answer in which, besides the address, bibliographic
information also will be shown, i.e., noise will appear in the
answer.

To  lower  the  noise  emitted  by  the  system,  it  is  necessary  to 
detail  the  criteria  of  the  accompanying  data. 
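
The effect of detailing the criteria can be seen in a small sketch. The
zone layout as (criterion, descriptor) pairs, the function name answer,
and the stored strings are assumptions made for this illustration.

```python
# A retrieval zone as a list of (criterion, descriptor) pairs; values are invented.
COARSE_ZONE = [
    ("classification data", "handbook title (placeholder)"),
    ("classification data", "address 17"),
]

DETAILED_ZONE = [
    ("bibliography", "handbook title (placeholder)"),
    ("address", "address 17"),
]


def answer(zone, criterion):
    """Return every descriptor of the zone marked with the requested criterion."""
    return [descriptor for mark, descriptor in zone if mark == criterion]


print(answer(COARSE_ZONE, "classification data"))  # address plus bibliography
print(answer(DETAILED_ZONE, "address"))            # address only
```

With one coarse criterion the address cannot be requested without the
bibliographic information; with separate criteria the noise disappears.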

One of the methods of organizing documentary data in the
retrieval zone is tableization - representing these data in a form in
which the semantic meaning of each is determined by its position in the
ordered list. Tableization should be used if there is a sufficiently
large group of documents with a common series of similar (in meaning)
data which have different quantitative values.

In  tableization,  in  the  retrieval  zone  of  the  storage  unit  the 
numerical  values  and  reference  numbers  of  corresponding  data  are 
filed  from  the  combined  ordered  table  of  their  semantic  meanings. 
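
A minimal sketch of tableization follows. It assumes the combined
ordered table is a simple list and that reference numbers start at one;
the names TABLE, tableize, and detableize are invented for the example.

```python
# Combined ordered table of semantic meanings (kept once, outside the zones).
TABLE = ["greatest height, mm", "greatest diameter, mm"]


def tableize(data):
    """Replace each name by its reference number in TABLE, keeping the value."""
    return [(TABLE.index(name) + 1, value) for name, value in data.items()]


def detableize(zone):
    """Restore the semantic meaning of each value from its reference number."""
    return {TABLE[number - 1]: value for number, value in zone}


zone = tableize({"greatest height, mm": 10, "greatest diameter, mm": 31})
print(zone)
```

The retrieval zone then holds only pairs of reference number and
numerical value; the words themselves are stored once in the table.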

The process of data tableization when compiling retrieval zones can
be automated by including a combined ordered table in the automatic
translation dictionary from the natural language to the information
retrieval language (IPYa). Of course, tableized data in the
retrieval zone must have the criterion "table."

corresponding words of the retrieval zone would have double
information. However, a semantic table, in comparison with an ordinary
descriptor description, gives no essential advantage, since the
number of words in the retrieval zone does not change, and the volume
of the combined list of tableized data increases.


The essential question of organizing the information storage
unit of an SPOD is the translation of words of the natural language
of documents into IPYa. Here we must consider that the translation
method must be useful both for the text of the document and
for the text of requests. This translation can be carried out
manually  or  automatically.  Furthermore,  during  translation  into  the 
IPYa  additional  classificational  operations  can  be  carried  out.  Here 
the  texts  are  given  additional  organization  by  means  of  preliminary 
processing,  leading  to  indexing. 


In the simplest case the IPYa of a system can be natural language
translated  into  machine  form.  Here,  to  decrease  the  word  length, 
different  artificial  methods  of  compressing  word  codes  [3]  are  used. 


An inherent deficiency of this method is the need for the
consumer to use, when compiling the request, the same words of the
natural language which were used to compile the retrieval form, even
though the set of words utilized in the retrieval form
is unknown to the consumer. Use of this method is expedient in
systems with a formalized setting [Translator's Note: This is the
literal translation of the word "usroyavshiysya." No appropriate
translation found in available sources.] language.

In highly organized systems, dictionaries are used. Here the
words of the document and the request are translated into the IPYa
in dictionary terms, which eliminates losses during retrieval due to
ambiguity of words utilized.

Furthermore, the dictionary can realize base-type and textual
relationships; it can also give additional organization to the set
of words of the retrieval and request forms.

The  use  of  textual  relationships  in  the  SPOD  is  apparently 
inexpedient,  since  according  to  the  results  of  experimental 
investigations,  the  positive  effect  obtained  with  their  use  does  not 
justify  the  necessary  complications  of  the  system  for  their 
realization [2].

Moreover,  it  is  possible  to  assume  that  the  specific  character 
of  the  information  array  of  the  SPOD,  other  things  being  equal, 
ensures an additional drop of information noise when answering a
request.

The dictionary ensures translation of the words of the natural
language  into  IPYa  by  means  of  consecutive  comparison  of  input  words 
with  the  total  volume  of  words  used  in  the  dictionary. 

The  problem  of  synonymy  is  solved  by  including  in  the  dictionary 
a  group  of  synonyms  and  conditionally  equivalent  words. 

The  problem  of  homonymy  and  polysemy  is  solved  by  combining 
the  word  of  the  homonym  with  a  group  of  words  which  explain  its 
semantic  meaning. 
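
The treatment of synonymy and homonymy might be sketched as below. The
descriptor numbers, the sample words, and the rule of matching any one
explanatory word are assumptions made for illustration, not the
dictionary layout of the system described.

```python
# Synonyms and conditionally equivalent words share one descriptor number.
SYNONYMS = {"transistor": 101, "semiconductor triode": 101, "triode": 101}

# A homonym is resolved by the explanatory words that accompany it.
# The word "base" and its two senses are invented for this sketch.
HOMONYMS = {
    "base": [({"electrode", "transistor"}, 201),  # base of a transistor
             ({"data", "category"}, 202)],        # base data of a document
}


def translate(word, context=()):
    """Translate a natural-language word into a descriptor number, or None."""
    if word in SYNONYMS:
        return SYNONYMS[word]
    for explaining_words, descriptor in HOMONYMS.get(word, []):
        if explaining_words & set(context):
            return descriptor
    return None
```

A homonym given without any explanatory word stays untranslated here,
which is one possible policy among several.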

The  problem  of  calculating  base-type  relationships  is  solved 
by introducing classes of classificational criteria into the
descriptor.

Classificational  criteria  determine  the  hierarchic  belonging  and 
sort-type  relationships  of  data  of  the  information  array. 

The classificational criterion joins a series of subcriteria:

1) of the belonging of data to one of three categories:
classificational, base, or key;

3) of the level, if divisions of the document are disposed
at different levels;

4) of the table, for indicating the entry of corresponding data
in a class of tableized data;

5) of the quantity of words in a word combination translated
into IPYa by one descriptor (for example, the word combination
"maximum permissible electrical data" is translated into IPYa by
one descriptor);

6)  of  the  group  of  synonyms  or  conditionally  equivalent  words; 

7)  of  a  word  from  a  group  of  synonyms,  subject  to  print  out 
when  using  the  dictionary  in  the  mode  of  translation  from  IPYa  to  the 
natural  language,  in  the  mode  of  output  of  results. 


In  the  dictionary  digits  which  are  allocated  for  recording  the 
translation  of  a  word  of  a  language  into  IPYa,  either  the  serial 
numbers  of  the  tableized  data  which  are  the  translation  of  the  semantic 
meaning  of  this  data  into  the  IPYa  are  stored.  The  quantitative 
values  of  tableized  data  are  not  recorded  in  the  dictionary. 


During the formation of the classificational criterion it is
possible to manage without the subcriterion "table" with the following
organization of tableization. The serial numbers used as
descriptors during translation to IPYa of semantic data are limited,
for example, from below by a specified number "a." In this case,
tabular data are translated by descriptors, the values of which are
disposed in the form of serial numbers up to this number. The
belonging of a word to the table is determined in this case by a
special analyzer by the criterion of the value of the descriptor,
lying within the limits of from one to "a."
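
The analyzer described here reduces to a range check. In the sketch
below the boundary value 100 is an arbitrary stand-in for "a," and
treating the boundary itself as tabular is an assumption, since the
text leaves that point open.

```python
A = 100  # hypothetical boundary: serial numbers 1..A are reserved for tabular data


def is_tabular(descriptor):
    """Analyzer: decide table membership from the descriptor value alone."""
    return 1 <= descriptor <= A


def classify(descriptor):
    """Name the kind of data a descriptor number stands for."""
    return "table" if is_tabular(descriptor) else "semantic"
```

No separate "table" mark is stored; the numeric range of the
descriptor carries that information.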

The above dictionary organization places corresponding conditions
on the organization of the storage unit. Each descriptor of the
storage unit has a classificational criterion with corresponding
subcriteria. In the digits allocated for recording of the value of
the descriptor either the serial numbers of the descriptors in the
dictionary are recorded, or, in the case of tableization, the
numerical values of tableized data are recorded.

The semantic meanings of tableized data are recorded in the digits
of the subcriterion "table" in the form of the serial number of the
semantic meaning of this data in the ordered tableized list.


As  was  mentioned  above,  the  request  includes  a  group  of  words 
for  a  description  of  the  criterion  of  data  which  are  subject  to 
being  output  as  an  answer. 


This group of words, including synonyms and conditionally
equivalent words, enters into the supplemental dictionary. In
distinction from the basic composition of the dictionary, the words
of the supplemental list are translated into IPYa only by
classificational criterion. Since one and the same word can enter into
both groups of the dictionary list (basic and supplemental) and,
consequently, has to be translated into IPYa differently, it is
necessary to introduce a criterion of the group of dictionary
composition.

The belonging of a word to this or that group is established during
input of the request by singling out that part of it which pertains
to the description of the criterion of issued data. In the information
storage unit this criterion is absent.


When using the proposed method of organization of the information
storage unit, the SPOD ensures output of data corresponding to the
level shown in the request, and also lying below this level, but
pertaining to a higher section of the document.

This paper deals with one method of organizing the storage
unit of a descriptor IPS (information retrieval system) of the
SPOD type, the information array of which constitutes the
totality of uniform documents with ordered disposition of data within
each of them. Three categories of data composing the retrieval
form of a document have been defined: classificational, base, and
key. For the organization of retrieval in a group of documents
with a common series of similar data, use is recommended of the
tableization method, the process of which can be automated by
including a combined ordered table in the automatic dictionary of
translation from the natural language to the IPYa (information
retrieval language). Organization of the information storage
unit anticipates the inclusion in each of its descriptors of a
classificational criterion with corresponding subcriteria. The
SPOD ensures output of data which corresponds to the level shown
in the request, as well as data below this level, but pertaining
to a higher section of the document.


